Deep learning models for predicting the survival of patients with medulloblastoma based on a surveillance, epidemiology, and end results analysis

Medulloblastoma is a malignant neuroepithelial tumor of the central nervous system. Accurate prediction of prognosis is essential for therapeutic decisions in medulloblastoma patients. We analyzed data from 2,322 medulloblastoma patients in the SEER database and randomly divided the dataset into training and testing datasets in a 7:3 ratio. We built three models: one based on neural networks (DeepSurv), one based on ensemble learning (Random Survival Forest, RSF), and a classic Cox proportional-hazards (CoxPH) model. The DeepSurv model outperformed the RSF and classic CoxPH models, with C-indexes of 0.751 and 0.763 for the training and test datasets, respectively. Additionally, the DeepSurv model showed better accuracy in predicting 1-, 3-, and 5-year survival rates (AUC: 0.767–0.793). Therefore, our prediction model based on deep learning algorithms can predict the survival rate and survival period of medulloblastoma patients more accurately than the other models.

for treatment strategies to improve patient survival rates 13,14. Deep learning is a subfield of machine learning that involves discovering the distributed features of sample data by learning the underlying laws and levels of representation 15,16. Neural networks are at the heart of deep learning algorithms and consist of input, hidden, and output layers that can be used to solve complex, multi-factor, and non-linear problems. Deep learning-based models have become highly effective predictors of clinical outcomes across various disease domains owing to continuous advances in deep learning techniques and the abundance of biomedical big data. Jiang et al. 17 used an artificial neural network model to predict the survival rate of patients diagnosed with pancreatic neuroendocrine neoplasms from clinical information. Katzman et al. 18 combined a Cox proportional-hazards framework with a multilayer neural network architecture, known as the DeepSurv model, resulting in a personalized treatment recommendation system that showed remarkable performance.
To our knowledge, there is a lack of research combining deep learning techniques with the study of medulloblastoma. Therefore, this study aimed to fill this research gap using data obtained from the Surveillance, Epidemiology, and End Results (SEER) database, which contains information on patients diagnosed with medulloblastoma in the United States. The DeepSurv model was then used to predict their survival.

Method

Data source and patient selection
The data of this retrospective cohort study were obtained from the SEER database, which encompasses information from 18 cancer registries representing approximately 28% of the entire US population 19. This database offers extensive and detailed patient data, including demographic characteristics, tumor-related information, cause of death, and survival duration. The SEER*Stat software (version 8.3.6) was used to identify patients with medulloblastoma. The dataset covering the years 2000 to 2019 in the United States was accessed.
The patients included in the study had to meet the following criteria: (1) a confirmed pathological diagnosis of medulloblastoma; (2) identification of medulloblastoma cases based on the third edition of the International Classification of Diseases for Oncology (ICD-O-3) using specific ICD-O-3 codes for histopathology, including 9470/3 for medulloblastoma, NOS; 9471/3 for desmoplastic nodular medulloblastoma; and 9474/3 for large cell medulloblastoma. Furthermore, patients were required to have a known survival status and time. Afterwards, they were randomly divided into a training group and a testing group at a 7:3 ratio. A flowchart in Fig. 1 illustrates the process of patient selection.
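As an illustration of the 7:3 random division described above, the following minimal sketch shuffles patient identifiers with a fixed seed and cuts them into two disjoint groups. The function name `split_train_test`, the seed, and the use of integer identifiers are assumptions made for this example; the study does not report how the random assignment was implemented.

```python
import random


def split_train_test(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle identifiers reproducibly and cut them into two disjoint groups."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is repeatable
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # rounds down to a whole patient
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers 0..2321 stand in for the 2,322 eligible patients.
train_ids, test_ids = split_train_test(range(2322))
print(len(train_ids), len(test_ids))  # 1625 697
```

With 2,322 patients and a training fraction of 0.7, rounding down gives the 1,625 and 697 patients reported for the two sets.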

Model development
This study selected three models for training: DeepSurv, RSF, and CoxPH. DeepSurv is a deep feedforward neural network used to predict patients' survival time or survival probability. It employs a multi-layer neural network to capture the complex nonlinear relationship between patients' survival probability and input features. This study used the DeepSurv method described by Katzman et al. 18 to predict the survival outcome of patients diagnosed with medulloblastoma. RSF (Random Survival Forests) is a survival analysis method based on random forests. When constructing a random survival forest, subsets of samples and features are randomly selected, and multiple decision trees are built using these subsets 20. Each decision tree splits the samples at its nodes based on features and determines the optimal split by evaluating differences in survival time. The predictions from the individual decision trees are combined to obtain the final survival prediction. CoxPH is a semi-parametric regression model used to analyse survival data and estimate the risk of event occurrence. The Cox proportional-hazards model is used to compare the relative risks of events between different groups and to study the impact of various factors on event occurrence. It models the hazard of an event over time as a baseline hazard multiplied by covariate-dependent hazard ratios.
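The CoxPH model and the DeepSurv formulation of Katzman et al. 18 share the Cox partial likelihood as their objective; in DeepSurv the risk score comes from a neural network instead of a linear combination of covariates. The sketch below computes the average negative log partial likelihood for given risk scores. It is an illustration only: the function name, the toy data, and the simple handling of tied times are assumptions, and it is not the code used in this study.

```python
import math


def neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative log Cox partial likelihood over observed events.

    risk_scores: predicted log-risk per patient (higher means higher hazard)
    times:       follow-up time per patient
    events:      1 if the event was observed, 0 if the patient was censored
    """
    n_events = sum(events)
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    total = 0.0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        risk_set_sum = sum(
            math.exp(risk_scores[j])
            for j in range(len(times))
            if times[j] >= times[i]
        )
        total += risk_scores[i] - math.log(risk_set_sum)
    return -total / n_events


# Toy data (hypothetical): three patients who all experienced the event.
toy_times = [1, 2, 3]
toy_events = [1, 1, 1]
good = neg_log_partial_likelihood([2.0, 1.0, 0.0], toy_times, toy_events)
bad = neg_log_partial_likelihood([0.0, 1.0, 2.0], toy_times, toy_events)
print(good < bad)  # True: ranking early deaths as high risk lowers the loss
```

Minimizing this quantity rewards risk scores that rank patients with earlier events above those who survive longer, which is also what the C-index measures.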
We tuned the hyperparameters of the DeepSurv model using grid search and fivefold cross-validation on the training dataset, selecting the parameter combination with the highest average C-index across the cross-validation folds as optimal.
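The tuning procedure can be sketched as follows: every combination in the grid is scored on each of five held-out folds, and the combination with the highest mean score is kept. The callable `fit_and_score` is a hypothetical placeholder that would train a model on the training indices and return its C-index on the held-out indices; here a toy scoring function replaces real model training, and the grid values are invented for illustration.

```python
import itertools
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Partition sample indices into k disjoint folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def grid_search_cv(param_grid, n_samples, fit_and_score, k=5, seed=0):
    """Return the parameter combination with the highest mean fold score."""
    folds = k_fold_indices(n_samples, k, seed)
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for held_out in folds:
            held_out_set = set(held_out)
            train_idx = [i for i in range(n_samples) if i not in held_out_set]
            scores.append(fit_and_score(params, train_idx, held_out))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_score(params, train_idx, valid_idx):
    """Hypothetical stand-in for 'train a model, return validation C-index'."""
    return 0.7 - abs(params["dropout"] - 0.2) - abs(params["learning_rate"] - 0.001)


toy_grid = {"dropout": [0.1, 0.2, 0.5], "learning_rate": [0.01, 0.001]}
best_params, best_score = grid_search_cv(toy_grid, n_samples=1625, fit_and_score=toy_score)
print(best_params)
```

Because the folds are drawn only from the training set, the test set plays no part in choosing the hyperparameters.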
For the implementation of the algorithms in this research, CoxPH and RSF were implemented using the Python package "Scikit-learn (version 0.24.1)", and DeepSurv was implemented using the open-source Python package "Tensorflow-gpu (version 2.6.2)".
www.nature.com/scientificreports/

Model evaluation
The study evaluated model performance using several metrics, including the C-index, Brier score, integrated Brier score (IBS), receiver operating characteristic (ROC) curves, and area under the curve (AUC) values. The C-index is a commonly used metric for evaluating the accuracy of survival predictions 21. It measures the concordance between the predicted survival risk and the actual observed survival time. A C-index of 0.5 indicates random predictions, while a value of 1.0 indicates perfect predictions. The Brier score assesses the mean squared difference between the observed patient statuses (event occurrence or censoring) and the predicted survival probabilities. It ranges from 0 to 1, with 0 indicating a perfect match between predictions and observations. In practice, models with Brier scores less than 0.25 are considered useful 22,23. The IBS evaluates the overall performance of a survival model across all available time points 24. It integrates the Brier score over the follow-up period, providing a single summary measure of predictive accuracy. ROC curves are frequently used to assess a model's sensitivity and specificity at various discrimination thresholds. The ROC curve plots the true positive rate against the false positive rate. AUC values, which range from 0 to 1, quantify the overall performance of the model. A higher AUC indicates better discrimination ability. This study calculated AUC values to assess model performance for 1-, 3-, and 5-year survival.
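To make the C-index concrete, the sketch below implements Harrell's concordance index in its basic form. A pair of patients is comparable when the one with the shorter follow-up time had an observed event; the pair is concordant when that patient also received the higher predicted risk. The function name and toy data are assumptions for illustration, and tied follow-up times are simply skipped, which a production implementation would handle more carefully.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    Ties in predicted risk count as half-concordant; pairs with tied
    follow-up times are ignored in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair must be anchored on an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third patient is censored at time 3.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
print(concordance_index(toy_times, toy_events, [4, 3, 2, 1]))  # 1.0, perfect ranking
print(concordance_index(toy_times, toy_events, [1, 1, 1, 1]))  # 0.5, uninformative
```

The censored patient still contributes to pairs in which another patient had an event earlier, which is how the metric uses partial follow-up information.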

Statistical analysis
In the clinical data, continuous variables are expressed as mean ± standard deviation (SD), while categorical variables are described using frequencies and percentages. Statistical tests such as chi-square tests and unpaired t-tests are used to compare variables between groups.
The predictive model was generated by partitioning the complete dataset into two mutually exclusive subsets. Seventy percent of the dataset was allocated to the training set, while the remaining 30% was used for the testing set. Model generation was performed on 1,625 randomly assigned patients from the training set, while the accuracy of the model was estimated using 697 randomly assigned patients from the testing set. No statistically significant differences in characteristics were found between the two groups (refer to Table 1). Additionally, survival outcomes showed no differences between the two groups (refer to Fig. S1).
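As a worked example of the chi-square comparison between groups, the sketch below computes Pearson's statistic for a contingency table and, for the 2 × 2 case (one degree of freedom), the corresponding p-value. The counts are invented for illustration and are not taken from Table 1; no continuity correction is applied, and the function names are assumptions.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value for a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts: rows are training/test sets, columns are two levels
# of a binary characteristic. Row sums match the reported set sizes only.
toy_table = [[820, 805], [350, 347]]
chi2, dof = chi_square_statistic(toy_table)
p_value = p_value_one_dof(chi2)
print(dof, p_value > 0.05)  # 1 True: no evidence of imbalance in this toy table
```

A large p-value for every characteristic is what supports the statement that the two sets are comparable.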

Cox proportional-hazard (CoxPH) model
The CoxPH model was developed using the training set (refer to Fig. 3). Only variables that showed statistical significance in the univariate analysis were included in the multivariate analysis. The survival of medulloblastoma patients was significantly affected by non-surgical treatment, LC, white race, tumor size ≤ 3.4 cm, total resection, age > 3 years, chemotherapy, and radiotherapy. Furthermore, the survival of the patients was significantly associated with these features in the multivariate analysis. The collinearity analysis also revealed a high correlation between age and radiotherapy, as well as between chemotherapy and radiotherapy (refer to Fig. S2). Ultimately, we included seven features (age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy) in the model development.
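A collinearity check of the kind mentioned above can be illustrated with a pairwise Pearson correlation between two coded variables. The sketch below is a minimal example under stated assumptions: the study does not say which correlation measure was used, and the binary indicator vectors are invented for illustration.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical indicators for eight patients (1 = treatment received).
toy_chemotherapy = [1, 1, 1, 0, 1, 0, 1, 0]
toy_radiotherapy = [1, 1, 1, 0, 1, 0, 0, 0]
r = pearson_correlation(toy_chemotherapy, toy_radiotherapy)
print(round(r, 3))
```

Values close to 1 or -1 flag pairs of variables whose individual effects are hard to separate in a regression model.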

Random survival forests (RSF)
Prediction error was calculated using the out-of-bag (OOB) samples from the training set (Fig. 4A). The predicted survival functions for patients in the test cohort are plotted in Fig. 4B. Variable Importance (VIMP) indicates the extent to which each sample characteristic contributes to the prediction, as shown in Fig. 4C. A higher VIMP value indicates a greater influence of that variable on accurately predicting the outcome 25. The interactions between variables in the analyzed data are displayed in Fig. 4D. If one variable's split in a decision tree influences the split of another variable, this suggests an interaction between those variables 26,27. The extent of interactions is assessed based on the minimum depth, which represents the distance from the root node to the node where the variable first splits. In this case, chemotherapy and radiotherapy had the lowest minimum depth among the variables considered, suggesting that they interact most strongly with the other variables.
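The minimum depth described above can be sketched on a toy forest. In this example a tree is a nested dictionary with the keys `variable`, `left`, and `right`, and leaves are `None`; this representation and the two toy trees are assumptions made for illustration and do not reflect how any particular RSF library stores its trees. For simplicity, trees that never split on a variable are left out of its average, whereas other conventions assign such trees a maximal depth.

```python
from collections import deque


def minimal_depth(tree, variable):
    """Depth of the shallowest node that splits on `variable` (root = 0)."""
    queue = deque([(tree, 0)])
    while queue:
        node, depth = queue.popleft()
        if node is None:
            continue  # reached a leaf
        if node["variable"] == variable:
            return depth  # breadth-first order guarantees this is the shallowest
        queue.append((node["left"], depth + 1))
        queue.append((node["right"], depth + 1))
    return None  # the variable is never used in this tree


def mean_minimal_depth(forest, variable):
    """Average minimal depth over the trees that split on `variable`."""
    depths = [minimal_depth(tree, variable) for tree in forest]
    depths = [d for d in depths if d is not None]
    return sum(depths) / len(depths) if depths else None


# Two hypothetical trees.
tree_a = {
    "variable": "chemotherapy",
    "left": {"variable": "age", "left": None, "right": None},
    "right": {"variable": "radiotherapy", "left": None, "right": None},
}
tree_b = {
    "variable": "radiotherapy",
    "left": {"variable": "chemotherapy", "left": None, "right": None},
    "right": None,
}
toy_forest = [tree_a, tree_b]
print(mean_minimal_depth(toy_forest, "chemotherapy"))  # 0.5
print(mean_minimal_depth(toy_forest, "age"))  # 1.0
```

Variables that tend to split close to the root, and therefore have a small minimum depth, partition the largest groups of patients.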

DeepSurv
The hyperparameters of DeepSurv were tuned with reference to previous studies, using grid search and fivefold cross-validation on the training dataset 18,28. The model with the optimal set of hyperparameters achieved an accuracy of 91.06% and a corresponding R² value of 0.6455. The best combination of model hyperparameters included 2000 epochs, the Adam optimizer, binary cross-entropy loss, four layers (nodes: 32, 64, 128, 256), a dropout rate of 0.2, and a learning rate of 0.001. Furthermore, the performance of the model was evaluated with the testing set. The loss function curve illustrates the relationship between the loss and the number of iterations, providing valuable information about the convergence and performance of the model 29. In addition, the C-index is a commonly used metric for evaluating the performance of survival analysis models. If the C-index is measured only on the training set, overfitting cannot be ruled out, because a model that fits the training data too closely may generalize poorly to test data 30. In this study, the C-index was measured on two mutually exclusive datasets (training and test), and no overfitting was observed. The learning process of DeepSurv, a survival prediction model based on deep learning, was visualized (Fig. 5). The figure showed a good model fit, indicating that the model was effectively learning and capturing the underlying patterns in the data.

Discussion
Medulloblastoma, a malignant brain tumor that mainly affects children, continues to pose a substantial challenge in pediatric oncology. Precisely predicting the individual prognosis of patients is crucial for customizing treatment approaches and enhancing survival rates. Prior research has identified several prognostic factors that affect the survival duration of medulloblastoma patients, including age, extent of surgical removal, and the administration of radiotherapy or chemotherapy 7,31,32. Moreover, as medical advancements progress, an increasing amount of imaging data 5 and genetic data 33 is being used for survival analysis of medulloblastoma patients. However, classical survival analysis methods, such as the Cox proportional-hazards model, assume a linear relationship between the covariates and the log hazard, which may be limiting in the face of multidimensional data. With the advancement of artificial intelligence, machine learning methods are being applied to clinical, imaging, and genetic data, allowing for the discovery of potential nonlinear relationships within the data 34-36. Within machine learning, deep learning is a specific class of methods that uses multilayered neural networks to extract high-order features. Deep learning has gained increasing popularity in the field of cancer survival analysis and has demonstrated excellent performance 37-39. As far as we know, this approach has not been applied to medulloblastoma. Therefore, we applied a deep learning model (DeepSurv) to predict the overall survival (OS) of medulloblastoma patients and compared its performance to that of a machine learning model (RSF) and a classical model (CoxPH).
By extracting potentially significant features from the SEER database, this research developed multiple models to forecast the survival rates of individuals diagnosed with medulloblastoma. Initially, we used the X-tile tool to determine the optimal cutoff values for age and tumor size from a cohort of 2,322 medulloblastoma patients. We identified two high-risk factors, age ≤ 3 years old and tumor size > 3.4 cm, that significantly impact the survival duration of patients with medulloblastoma. Subsequently, we employed Cox proportional hazards regression to identify variables associated with the prognosis of medulloblastoma patients. Age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy were selected for inclusion in the modeling process (p < 0.05). We established RSF, DeepSurv, and CoxPH models and evaluated their performance using metrics such as the C-index, IBS, and ROC curve. The study results demonstrated that the DeepSurv model outperformed both the CoxPH and RSF models, as indicated by its higher C-index in both the training and testing sets. Moreover, the DeepSurv model exhibited the lowest IBS and the largest AUC values when predicting 1-, 3-, and 5-year survival. These findings collectively suggest that the DeepSurv model is more accurate in predicting the survival of patients with medulloblastoma. In previous studies, Guo et al. 7 and Zhou et al. 5 used Cox proportional hazard regression for survival analysis of medulloblastoma and developed a nomogram. Compared with their studies, the C-index values obtained from the DeepSurv model were higher in both the training and the testing cohort, indicating its superior predictive accuracy for the prognosis of patients with medulloblastoma. This finding is consistent with the results reported in several previous studies focusing on cancer prognosis 40,41. The main advantage of the DeepSurv model is its ability to handle both linear and non-linear predictive variables using a multi-layer neural network. It can capture arbitrarily complex non-linear interactions in the data, allowing such models to discover correlations that are difficult for the human eye or traditional statistical techniques to detect.
Nevertheless, our study has several limitations. Firstly, the data collected from the SEER database for medulloblastoma patients lack some information that may affect survival outcomes, including important details such as molecular subgroups, specific radiotherapy doses, and chemotherapy regimens. In particular, molecular diagnostics are critical for treatment and prognosis prediction of tumours, especially medulloblastoma. The availability and completeness of these data depend on continuous improvements in data collection in the SEER database. Secondly, our model has yet to undergo external validation, and its performance needs to be validated on new data. Further validation using independent datasets would enhance the reliability and generalizability of the findings. Another inherent limitation lies within the DeepSurv model itself. Because of the hidden layers in its architecture, the model operates as a black box, making it challenging to fully comprehend the computations involved in the model construction process and its associated limitations. Future research should aim to address these concerns and explore the inner workings of the model to improve interpretability.

Conclusions
This study employed Cox proportional hazards regression analysis to examine the prognostic factors influencing medulloblastoma patients' outcomes, which include age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy. Subsequently, we developed a DeepSurv prediction model, which exhibited strong predictive capabilities in assessing the prognosis of patients diagnosed with medulloblastoma. The DeepSurv model shows potential for accurately predicting the survival duration of medulloblastoma patients.

Figure 1. Study profile and analysis pipeline. Patients with a diagnosis of medulloblastoma as primary tumor in the SEER database 2000-2019 with complete follow-up data. The entire dataset was divided 7:3 into training (n = 1,625) and test (n = 697) sets. The CoxPH and RSF models were constructed directly from the training set data. When constructing the DeepSurv model, we used grid search and fivefold cross-validation for hyperparameter tuning on the training dataset. Finally, the performance of the models was evaluated in the testing set (n = 697) using several metrics.

Figure 2. The X-tile analysis was conducted to determine the best cutoff points for the variables of age and tumor size. (A) X-tile plot of training sets in age. (B) The cutoff point highlighted using a histogram of the entire cohort. (C) Kaplan-Meier plot showing the distinct prognosis determined by the cutoff point. (D) X-tile plot of training sets in tumor size. (E) The cutoff point highlighted using a histogram. (F) Kaplan-Meier plot showing the prognosis determined by the cutoff point. The low subset is depicted in gray, while the high subset is shown in blue. https://doi.org/10.1038/s41598-024-65367-9

Figure 4. Random survival forest model. Eight features were used to construct the model: sex, age, radiotherapy, chemotherapy, histopathology, race, surgery, tumor size. (A) Out-of-bag (OOB) error rate. (B) Predicted survival curves generated for the testing set. (C) Variable importance plot. Higher values of Variable Importance (VIMP) indicate that the variable contributes more to the predictive accuracy of the model. (D) Variable interaction plot. Lower values indicate a higher level of interactivity between the variables.

Figure 5. The training and testing history of DeepSurv. (A) A plot of loss on the training and testing sets. The error gradually decreases over each iteration during training. (B) A plot of the concordance index obtained by the model in the training and test sets as a function of the epochs. The model neither fits the training data too closely nor fails to capture important patterns in the data.

Table 1. Characteristic distribution of data in the training and test sets.

Table 2. Performance of three survival models. Significant values are in bold.